ct: use compaction scheduling group#29492
ct: use compaction scheduling group#29492WillemKauf wants to merge 4 commits intoredpanda-data:devfrom
ct: use compaction scheduling group#29492Conversation
There was a problem hiding this comment.
Pull request overview
This PR integrates the compaction scheduling group into cloud topics to enable better resource management for compaction operations. The changes add functionality to compute the compaction backlog in bytes for cloud topics and hook it into the existing compaction_controller PID system.
Changes:
- Extended the
compaction_controllerto accept a callback function for computing cloud topics compaction backlog - Added
dirty_bytesfield to compaction metadata structures to track the size of dirty data - Modified cloud topics compaction workers to use the
compactionscheduling group for CPU resource allocation
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/v/storage/compaction_controller.h |
Added backlog_fn callback parameter to support cloud topics backlog computation |
src/v/storage/compaction_controller.cc |
Implemented cloud topics backlog aggregation via cross-shard submission to shard 0 |
src/v/redpanda/application_services.cc |
Wired up cloud topics backlog callback when initializing compaction controller |
src/v/cloud_topics/level_one/metastore/simple_metastore.h |
Introduced dirty_stats struct to return both dirty ratio and bytes |
src/v/cloud_topics/level_one/metastore/simple_metastore.cc |
Refactored dirty ratio calculation to compute and return dirty bytes |
src/v/cloud_topics/level_one/metastore/rpc_types.h |
Added dirty_bytes field to RPC types with version bump |
src/v/cloud_topics/level_one/metastore/replicated_metastore.cc |
Updated to propagate dirty_bytes field from RPC responses |
src/v/cloud_topics/level_one/metastore/metastore.h |
Added dirty_bytes to compaction_info_response structure |
src/v/cloud_topics/level_one/domain/simple_domain_manager.cc |
Propagated dirty_bytes in domain manager compaction info responses |
src/v/cloud_topics/level_one/domain/db_domain_manager.cc |
Propagated dirty_bytes in database domain manager |
src/v/cloud_topics/level_one/compaction/worker.h |
Added scheduling group member to compaction worker |
src/v/cloud_topics/level_one/compaction/worker.cc |
Modified worker loop to execute within the compaction scheduling group |
src/v/cloud_topics/level_one/compaction/worker_manager.cc |
Updated to pass compaction scheduling group to workers |
src/v/cloud_topics/level_one/compaction/scheduler.h |
Added compaction_backlog() method declaration |
src/v/cloud_topics/level_one/compaction/scheduler.cc |
Implemented backlog computation summing dirty bytes from queued logs |
src/v/cloud_topics/app.h |
Added public compaction_backlog() method to app interface |
src/v/cloud_topics/app.cc |
Implemented app-level backlog accessor delegating to scheduler |
| Test files | Updated test assertions to verify dirty_bytes computation and updated test fixtures with new parameters |
| auto fn = _cloud_topics_backlog; | ||
| cloud_backlog = co_await ss::smp::submit_to(0, std::move(fn)); |
There was a problem hiding this comment.
The local copy of _cloud_topics_backlog into fn is unnecessary. The std::move on line 27 will transfer ownership of the copy, not the member variable. Consider directly moving _cloud_topics_backlog into submit_to without the intermediate variable.
| log_ptr->info_and_ts->info.dirty_bytes); | ||
| } | ||
| } | ||
| return total / static_cast<int64_t>(ss::smp::count); |
There was a problem hiding this comment.
Dividing by the number of shards may result in loss of precision and underreporting the backlog when total is not evenly divisible by shard count. Consider returning the total backlog without division, or document why per-shard reporting is necessary and how rounding affects the PID controller behavior.
| return total / static_cast<int64_t>(ss::smp::count); | |
| auto shards = static_cast<int64_t>(ss::smp::count); | |
| if (shards <= 1) { | |
| return total; | |
| } | |
| // Use ceiling division to avoid systematic under-reporting of backlog | |
| return (total + shards - 1) / shards; |
8d2bc63 to
2978c2e
Compare
CI test resultstest results on build#79972
|
|
Nevermind, looks like we may be going with #29288. |
Hooks the
compactionscheduling group into cloud topics, and adds functionality to compute the backlog of compacted logs in cloud topics along with hooking it into thecompaction_controllerPID to control the number of shares allocated to thecompactionscheduling group.Backports Required
Release Notes